3 research outputs found

    t-Exponential Memory Networks for Question-Answering Machines

    Recent advances in deep learning have brought to the fore models that can make multiple computational steps in the service of completing a task; these are capable of describing long-term dependencies in sequential data. Novel recurrent attention models over possibly large external memory modules constitute the core mechanisms that enable these capabilities. Our work addresses learning subtler and more complex underlying temporal dynamics in language modeling tasks that deal with sparse sequential data. To this end, we improve upon these recent advances by adopting concepts from the field of Bayesian statistics, namely variational inference. Our proposed approach consists in treating the network parameters as latent variables with a prior distribution imposed over them. Our statistical assumptions go beyond the standard practice of postulating Gaussian priors. Indeed, to allow for handling outliers, which are prevalent in long observed sequences of multivariate data, multivariate t-exponential distributions are imposed. On this basis, we proceed to infer the corresponding posteriors; these can be used for inference and prediction at test time, in a way that accounts for the uncertainty in the available sparse training data. Specifically, to allow our approach to best exploit the merits of the t-exponential family, our method considers a new t-divergence measure, which generalizes the concept of the Kullback-Leibler divergence. We perform an extensive experimental evaluation of our approach, using challenging language modeling benchmarks, and illustrate its superiority over existing state-of-the-art techniques.
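    The abstract leans on two constructs it does not define: the deformed logarithm and exponential of the t-exponential family, and the t-divergence built from them. As a sketch using the standard notation from the Tsallis-statistics literature (the paper's exact conventions may differ), for a deformation parameter t ≠ 1:

        \[
        \exp_t(x) = \bigl[\,1 + (1-t)\,x\,\bigr]_+^{\frac{1}{1-t}}, \qquad
        \log_t(x) = \frac{x^{1-t} - 1}{1-t},
        \]
        both reducing to the ordinary exponential and logarithm as \(t \to 1\).
        The t-divergence between a variational posterior \(q\) and a prior \(p\)
        is then commonly defined through the escort distribution \(\tilde{q}\) of \(q\):
        \[
        D_t(q \,\|\, p) = \int \tilde{q}(\theta)\,\bigl(\log_t q(\theta) - \log_t p(\theta)\bigr)\,\mathrm{d}\theta,
        \qquad
        \tilde{q}(\theta) = \frac{q(\theta)^t}{\int q(\theta')^t\,\mathrm{d}\theta'}.
        \]

    Setting \(t = 1\) recovers the Kullback-Leibler divergence used in standard variational inference, which is the sense in which the t-divergence generalizes it.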

    Natural Language Processing with Deep Neural Networks

    Since the very first days of the Computer Era, machines have given us the ability to collect and store vast amounts of information. It soon became obvious that harvesting that information was an entirely different task, much more complicated and demanding. Many solutions were developed to make it possible for humans to communicate with computers for the purpose of data mining; databases, query languages and search engines were for decades the most prevalent among them. At first, special skills were required to perform advanced tasks, such as knowledge of a query language or of a search engine's syntax. For search to be adopted by users, it should be easy and human-friendly. Nowadays, search engines use simple language syntax and have made significant progress along the natural language path. Still, search engines fall short of combining information from different sources to produce a synthetic answer. Computers are, at their core, computational machines; they are therefore excellent at manipulating syntax and counting word frequencies, but weak at recognizing the concepts behind the words. A traditional search engine is not able to draw conclusions or grasp the context of a dialogue. Machine learning has proven strong in dealing with such concepts. One of the most challenging fields of machine learning is Natural Language Processing (NLP), and especially its component Natural Language Understanding (NLU). The crest of NLP comprises the question-answering and summarisation tasks, in the sense that strong cognitive ability is required for the conceptual context to be extracted. Supervised learning of deep neural networks is currently the best available tool for these tasks. Despite the rapid advances in the field of machine learning, performance remains poor on hard NLU and NLP tasks, such as abstractive summarisation and question answering. This dissertation aims to offer substantive and measurable progress in both these areas by ameliorating a key problem of modern machine learning techniques: the need for dense and large data corpora for effective model training. This is an especially hard requirement to satisfy in the context of such applications. To this end, we leverage arguments from the field of Bayesian inference. This allows for better handling of the modelling uncertainty which is the direct outcome of data sparsity and which results in poor modelling/generalisation performance. Our approaches are founded upon solid and elaborate statistical inference arguments, and are evaluated using challenging popular benchmarks. As we show, they offer tangible performance advantages over the state-of-the-art.
    Sergios Theodoridis, Nicolas Tsapatsoulis
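    The abstract describes the dissertation's common thread, handling modelling uncertainty through Bayesian inference over network parameters, without implementation detail. As a minimal sketch of that generic technique (variational inference over the weights of a layer, via the reparameterization trick), assuming a PyTorch setting and a hypothetical class name BayesianLinear; this illustrates the general approach, not the dissertation's specific method:

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class BayesianLinear(nn.Module):
            # Linear layer whose weights are latent variables with a
            # Gaussian prior N(0, prior_std^2); the variational posterior
            # is a diagonal Gaussian with learnable mean and scale.
            def __init__(self, in_features, out_features, prior_std=1.0):
                super().__init__()
                self.mu = nn.Parameter(0.1 * torch.randn(out_features, in_features))
                self.rho = nn.Parameter(torch.full((out_features, in_features), -5.0))
                self.prior_std = prior_std

            def forward(self, x):
                std = F.softplus(self.rho)      # keep the posterior std positive
                eps = torch.randn_like(std)
                weight = self.mu + std * eps    # reparameterization trick
                # Closed-form KL(q || p) between diagonal Gaussians,
                # summed over all weights; added to the loss during training.
                self.kl = (torch.log(self.prior_std / std)
                           + (std ** 2 + self.mu ** 2) / (2 * self.prior_std ** 2)
                           - 0.5).sum()
                return F.linear(x, weight)

    Training would minimise the negative evidence lower bound, i.e. the task loss plus the accumulated kl terms scaled down by the number of minibatches; at test time, averaging predictions over several weight samples reflects the uncertainty under sparse data that the abstract refers to.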

    Dialog Speech Sentiment Classification for Imbalanced Datasets

    Speech is the most common way humans express their feelings, and sentiment analysis is the use of tools such as natural language processing and computational algorithms to identify the polarity of these feelings. Even though this field has seen tremendous advancements in the last two decades, effectively detecting underrepresented sentiments in different kinds of datasets remains a challenging task. In this paper, we use single- and bi-modal analysis of short dialog utterances and gain insights into the main factors that aid sentiment detection, particularly in the underrepresented classes, in datasets with and without an inherent sentiment component. Furthermore, we propose an architecture which uses a learning rate scheduler and different monitoring criteria, and which provides state-of-the-art results on the imbalanced SWITCHBOARD sentiment dataset.
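    The abstract names a learning rate scheduler and monitoring criteria but not the specific choices. A common setup consistent with that description, sketched here with synthetic stand-in data and an assumed PyTorch/scikit-learn toolchain (the real work would use SWITCHBOARD utterance features), is a class-weighted loss with ReduceLROnPlateau driven by validation macro-F1:

        import torch
        import torch.nn as nn
        from torch.utils.data import DataLoader, TensorDataset
        from torch.optim.lr_scheduler import ReduceLROnPlateau
        from sklearn.metrics import f1_score

        # Synthetic stand-in data: 3 sentiment classes with heavy imbalance.
        X = torch.randn(1000, 40)
        y = torch.cat([torch.zeros(850), torch.ones(100), torch.full((50,), 2.0)]).long()
        train_loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)
        val_X, val_y = torch.randn(200, 40), torch.randint(0, 3, (200,))

        model = nn.Sequential(nn.Linear(40, 64), nn.ReLU(), nn.Linear(64, 3))

        # Inverse-frequency class weights: rare sentiments contribute more to the loss.
        counts = torch.bincount(y).float()
        criterion = nn.CrossEntropyLoss(weight=counts.sum() / (len(counts) * counts))

        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        # Halve the learning rate when validation macro-F1 (which weighs all
        # classes equally) plateaus: the monitoring criterion drives the schedule.
        scheduler = ReduceLROnPlateau(optimizer, mode="max", factor=0.5, patience=2)

        for epoch in range(10):
            model.train()
            for xb, yb in train_loader:
                optimizer.zero_grad()
                criterion(model(xb), yb).backward()
                optimizer.step()

            model.eval()
            with torch.no_grad():
                preds = model(val_X).argmax(dim=1)
            macro_f1 = f1_score(val_y.numpy(), preds.numpy(), average="macro")
            scheduler.step(macro_f1)

    Monitoring macro-F1 rather than accuracy matters here: with heavy imbalance, accuracy is dominated by the majority class, whereas macro-F1 rewards improvements on the underrepresented sentiments.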